Evaluating Instruction-Tuned Large Language Models on Code Comprehension and Generation
In this work, we evaluate 10 open-source instructed LLMs on four
representative code comprehension and generation tasks. We have the following
main findings. First, for the zero-shot setting, instructed LLMs are very
competitive on code comprehension and generation tasks and sometimes even
better than small SOTA models specifically fine-tuned on each downstream task.
We also find that larger instructed LLMs are not always better on code-related
tasks. Second, for the few-shot setting, we find that adding demonstration
examples substantially helps instructed LLMs perform better on most code
comprehension and generation tasks; however, the examples would sometimes
induce unstable or even worse performance. Furthermore, we find that the widely used
BM25-based shot selection strategy significantly outperforms basic random
selection or fixed selection only on generation problems. Third, for the
fine-tuning setting, we find that fine-tuning could further improve the model
performance on downstream code comprehension and generation tasks compared to
the zero-shot/one-shot performance. In addition, after being fine-tuned on the
same downstream task dataset, instructed LLMs outperform both the small SOTA
models and similar-scaled LLMs without instruction tuning. Based on our
findings, we further present practical implications on model and usage
recommendation, performance and cost trade-offs, and future direction
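
The BM25-based shot selection mentioned in the abstract above can be illustrated with a minimal sketch. It assumes whitespace tokenization, the common Okapi BM25 scoring form with default-style parameters `k1=1.5` and `b=0.75`, and a hypothetical demonstration pool of dictionaries with `input` and `output` keys; none of these details are taken from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: simple lowercase whitespace tokenization.
    return text.lower().split()


def bm25_scores(query, corpus, k1=1.5, b=0.75):
    """Score every document in `corpus` against `query` with Okapi BM25."""
    if not corpus:
        return []
    docs = [tokenize(d) for d in corpus]
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n or 1.0
    # Document frequency: number of documents containing each term.
    df = Counter()
    for d in docs:
        df.update(set(d))
    q_terms = tokenize(query)
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in q_terms:
            if t not in tf:
                continue
            idf = math.log((n - df[t] + 0.5) / (df[t] + 0.5) + 1.0)
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores


def select_shots(query, pool, k=1):
    """Return the k pool examples whose inputs are most similar to the query."""
    scores = bm25_scores(query, [ex["input"] for ex in pool])
    ranked = sorted(range(len(pool)), key=lambda i: scores[i], reverse=True)
    return [pool[i] for i in ranked[:k]]
```

The selected examples would then be prepended to the prompt as demonstrations; random or fixed selection would simply replace the ranking step.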
Recommending Analogical APIs via Knowledge Graph Embedding
Library migration, which re-implements the same software behavior by using a
different library instead of using the current one, has been widely observed in
software evolution. One essential part of library migration is to find an
analogical API that could provide the same functionality as the current one.
However, given the large number of libraries/APIs, manually finding an
analogical API could be very time-consuming and error-prone. Researchers have
developed multiple automated analogical API recommendation techniques.
Documentation-based methods have particularly attracted significant interest.
Despite their potential, these methods have limitations, such as a lack of
comprehensive semantic understanding in documentation and scalability
challenges. In this work, we propose KGE4AR, a novel documentation-based
approach that leverages knowledge graph (KG) embedding to recommend analogical
APIs during library migration. Specifically, KGE4AR proposes a novel unified
API KG to comprehensively and structurally represent three types of knowledge
in documentation, which can better capture the high-level semantics. Moreover,
KGE4AR then proposes to embed the unified API KG into vectors, enabling more
effective and scalable similarity calculation. We build KGE4AR's unified API
KG for 35,773 Java libraries and assess it in two API recommendation scenarios:
with and without target libraries. Our results show that KGE4AR substantially
outperforms state-of-the-art documentation-based techniques in both evaluation
scenarios in terms of all metrics (e.g., 47.1%-143.0% and 11.7%-80.6% MRR
improvements in each scenario). Additionally, we explore KGE4AR's scalability,
confirming its effective scaling with the growing number of libraries.
Comment: Accepted by FSE 202
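
As a rough illustration of embedding-based similarity and the MRR metric named in the abstract above, the following sketch ranks candidate APIs by cosine similarity and computes mean reciprocal rank. The use of cosine similarity, the function names, and the toy vectors are assumptions for illustration only, not KGE4AR's actual similarity calculation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def rank_candidates(query_vec, candidates):
    """Order candidate API names by similarity to the query embedding."""
    return sorted(
        candidates,
        key=lambda name: cosine(query_vec, candidates[name]),
        reverse=True,
    )


def mean_reciprocal_rank(ranked_lists, ground_truth):
    """Average of 1/rank of the first correct candidate for each query."""
    if not ranked_lists:
        return 0.0
    total = 0.0
    for query, ranking in ranked_lists.items():
        relevant = ground_truth.get(query, set())
        for rank, cand in enumerate(ranking, start=1):
            if cand in relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)
```

A query whose correct analogical API never appears in its ranking contributes zero to the average.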
ClassEval: A Manually-Crafted Benchmark for Evaluating LLMs on Class-level Code Generation
In this work, we make the first attempt to evaluate LLMs in a more
challenging code generation scenario, i.e. class-level code generation. We
first manually construct the first class-level code generation benchmark
ClassEval of 100 class-level Python code generation tasks with approximately
500 person-hours. Based on it, we then perform the first study of 11
state-of-the-art LLMs on class-level code generation. Based on our results, we
have the following main findings. First, we find that all existing LLMs show
much worse performance on class-level code generation than on standalone
method-level code generation benchmarks like HumanEval; and the method-level
coding ability cannot equivalently reflect the class-level coding ability among
LLMs. Second, we find that GPT-4 and GPT-3.5 still exhibit dominant superiority
over other LLMs on class-level code generation, and the second-tier models
include Instruct-Starcoder, Instruct-Codegen, and Wizardcoder with very
similar performance. Third, we find that generating the entire class all at
once (i.e. holistic generation strategy) is the best generation strategy only
for GPT-4 and GPT-3.5, while method-by-method generation strategies (i.e. incremental and
compositional) are better for the other models with limited ability
to understand long instructions and utilize the middle information.
Lastly, we find that models have limited ability to generate method-dependent code
and discuss the frequent error types in generated classes. Our benchmark is
available at https://github.com/FudanSELab/ClassEval
Evaluating and Improving Unified Debugging
Automated debugging techniques, including fault localization and program repair, have been studied for over a decade. However, the only existing connection between fault localization and program repair is that fault localization computes the potential buggy elements for program repair to patch. Recently, a pioneering work, ProFL, explored the idea of unified debugging to unify fault localization and program repair in the other direction for the first time to boost both areas. In this way, ProFL also extends the application scope of automated repair to all possible bugs (not only the small ratio of bugs that repair systems can automatically fix). However, ProFL only considers one program repair system, and it is not clear how other repair systems contribute to unified debugging. In this work, we perform an extensive study of the unified debugging approach on 16 state-of-the-art program repair systems for the first time. Our initial experimental results on the Defects4J benchmark reveal various practical guidelines for unified debugging, such as (1) nearly all 16 studied repair systems positively contribute to unified debugging despite their varying repair capabilities, (2) repair systems targeting multi-edit patches can bring extraneous noise, (3) repair systems with more executed/plausible patches tend to perform better, (4) unified debugging effectiveness does not rely on the availability of correct patches, and (5) we propose a new technique, UniDebug++, which localizes over 20% more bugs within Top-1 than the state-of-the-art technique ProFL.
Furthermore, we extend the above experiments to make the following additional contributions: we (6) further perform an extensive study on 76.3% additional bugs and confirm that UniDebug++ again outperforms ProFL by localizing 185 (out of 395) bugs within Top-1, (7) investigate the impact of 33 SBFL formulae and observe UniDebug++ consistently improving upon all formulae, (8) demonstrate that UniDebug++ can substantially boost state-of-the-art learning-based method-level fault localization techniques, (9) extend unified debugging to the statement level for the first time and observe that UniDebug++ localizes 78 (out of 395) bugs within Top-1 and outperforms state-of-the-art learning-based fault localization techniques by 30%, and finally (10) propose a new technique, UniDebug+*, based on detailed patch statistics, to further improve upon UniDebug++
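
To make the notion of an SBFL formula in the abstract above concrete, here is a minimal sketch of one widely known formula, Ochiai, applied to a toy coverage matrix. It shows only plain spectrum-based ranking; it is not UniDebug++ or ProFL, and the data layout (a mapping from test name to covered elements) is an assumption for illustration.

```python
import math


def ochiai(ef, ep, nf):
    """Ochiai suspiciousness: ef / sqrt((ef + nf) * (ef + ep)).

    ef: failing tests that execute the element
    ep: passing tests that execute the element
    nf: failing tests that do not execute the element
    """
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0


def rank_elements(coverage, failing):
    """Rank program elements from most to least suspicious.

    coverage: dict mapping test name -> set of covered elements
    failing:  set of names of failing tests
    """
    total_failed = sum(1 for t in coverage if t in failing)
    elements = set().union(*coverage.values()) if coverage else set()
    scores = {}
    for e in elements:
        ef = sum(1 for t, cov in coverage.items() if t in failing and e in cov)
        ep = sum(1 for t, cov in coverage.items() if t not in failing and e in cov)
        scores[e] = ochiai(ef, ep, total_failed - ef)
    # Sort by descending score, breaking ties by element name.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

A bug counts as localized within Top-1 when the faulty element is first in such a ranking.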
Development and Hybrid Position/Force Control of a Dual-Drive Macro-Fiber-Composite Microgripper
This paper reports on the development, implementation and hybrid control of a new macro-fiber-composite microgripper with synchronous position and force control capabilities. In particular, the macro-fiber-composite actuator was composed of rectangular piezoelectric fibers covered by interdigitated electrodes and embedded in structural epoxy. Thus, the macro-fiber-composite microgripper had a larger displacement-volume ratio (i.e., the ratio of the output displacement to the volume of the microgripper) than that of a traditional piezoelectric one. Moreover, to regulate both the gripper position and the gripping force simultaneously, a hybrid position/force control scheme using fuzzy sliding mode control and a proportional-integral controller was developed. In particular, the fuzzy sliding mode control was used to achieve precise position control under the influence of system disturbances and uncertainties, and the proportional-integral controller was used to guarantee the force control accuracy of the microgripper. A series of experimental investigations was performed to verify the feasibility of the developed microgripper and the control scheme. The experimental results validated the effectiveness of the designed microgripper and hybrid control scheme. The developed microgripper was capable of precision and multiscale micromanipulation tasks
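
The force-control half of the hybrid scheme described above can be sketched as a discrete proportional-integral loop. The first-order plant model, the gains, and the time step below are hypothetical values chosen only so that the example converges; they are not the microgripper's identified dynamics, and the fuzzy sliding mode position controller is not shown.

```python
class PIController:
    """Discrete proportional-integral controller."""

    def __init__(self, kp, ki, dt):
        self.kp = kp
        self.ki = ki
        self.dt = dt
        self.integral = 0.0

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        return self.kp * error + self.ki * self.integral


def simulate(setpoint=1.0, steps=2000, dt=0.001, tau=0.05, gain=1.0):
    """Track a force setpoint on an assumed first-order plant.

    Plant (assumption): tau * dF/dt = -F + gain * u, integrated with
    forward Euler. Returns the force after `steps` time steps.
    """
    ctrl = PIController(kp=2.0, ki=40.0, dt=dt)
    force = 0.0
    for _ in range(steps):
        u = ctrl.update(setpoint, force)
        force += dt * (-force + gain * u) / tau
    return force
```

The integral term is what removes the steady-state force error; with only proportional action the assumed plant would settle below the setpoint.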
Experimental Identification and Vibration Control of A Piezoelectric Flexible Manipulator Using Optimal Multi-Poles Placement Control
This paper presents experimental identification and vibration suppression of a flexible manipulator with piezoelectric actuators and strain sensors using optimal multi-poles placement control. To precisely identify the system model, a reduced order transfer function with relocated zeros is proposed, and a first-order inertia element is added to the model. Comparisons show that the identified model matches closely with the experimental results in both the time and frequency domains, and a fit of 97.2% is achieved. Based on the identified model, a full-state multi-poles placement controller is designed, and the optimal locations of the closed loop poles are determined where the move distance of the closed loop poles is the shortest. The feasibility of the proposed controller is validated by simulations. Moreover, the controller is tested for different locations of the closed loop poles, and an excellent performance of the optimal locations of the closed loop poles is shown. Finally, the effectiveness of the proposed controller is demonstrated by experiments. Results show that the vibrations of the expected modes are significantly diminished. Accordingly, multi-mode vibrations of the manipulator are well attenuated
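
The idea of full-state pole placement mentioned above can be illustrated on a single lightly damped vibration mode. This sketch assumes a second-order model x'' + 2*zeta*wn*x' + wn^2*x = u with state feedback u = -(k1*x + k2*x'); the numerical values are hypothetical, and the paper's criterion for choosing optimal pole locations (shortest move distance) is not implemented here.

```python
import cmath


def open_loop_coeffs(wn, zeta):
    """Coefficients (a0, a1) of s^2 + a1*s + a0 for one vibration mode."""
    return wn * wn, 2.0 * zeta * wn


def place_poles(wn, zeta, p1, p2):
    """State-feedback gains (k1, k2) that move the mode's poles to p1, p2.

    p1 and p2 should be real or a complex-conjugate pair so that the
    desired polynomial s^2 + d1*s + d0 has real coefficients.
    """
    a0, a1 = open_loop_coeffs(wn, zeta)
    d1 = -(p1 + p2).real
    d0 = (p1 * p2).real
    # Closed loop is s^2 + (a1 + k2)*s + (a0 + k1); match it to the target.
    return d0 - a0, d1 - a1


def closed_loop_poles(wn, zeta, k1, k2):
    """Roots of the closed-loop characteristic polynomial."""
    a0, a1 = open_loop_coeffs(wn, zeta)
    c0, c1 = a0 + k1, a1 + k2
    disc = cmath.sqrt(c1 * c1 - 4.0 * c0)
    return (-c1 + disc) / 2.0, (-c1 - disc) / 2.0
```

For example, with wn = 10 rad/s and zeta = 0.01, requesting poles at -6 ± 8j keeps the natural frequency and only adds damping, so the position gain k1 is zero and all the control effort goes into velocity feedback.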